Zinc fingers are small DNA-binding motifs which occur in a large family of
eukaryotic transcription factors. The DNA-binding specificity of zinc fingers can
be altered by protein engineering, for instance using phage display, to create novel
protein domains which recognise predetermined sequences. It has been proposed that
tailored DNA-binding domains of this type can be incorporated into proteins such as
restriction enzymes and transcription factors in order to target particular DNA
sequences or genes. The zinc finger domains studied so far - whether naturally 
occurring, designed or selected - can bind specifically to various DNA sites containing 
the four major DNA bases: A, G, C and T. However, the DNA of many organisms 
also contains a fifth base, 5-methylcytosine (5-meC), which arises from specific 
methylation of cytosine, and which is used to mark the genome or to increase its 
information content. 5-Methylcytosine is well known to affect protein-DNA
interactions, for instance by inhibiting cleavage of DNA by certain restriction enzymes.
In vertebrates, cytosine is frequently methylated when directly preceding guanine, as in
the dinucleotide CpG 11. This type of methylation generally down-regulates vertebrate
gene expression, and can also prevent the binding of many eukaryotic transcription
factors to DNA 11, 12. Yet the zinc finger transcription factors tested to date, Sp1 and
YY1, are not affected by CpG methylation of their DNA binding sites 14,
suggesting that zinc fingers are incapable of discriminating between cytosine and
5-meC. On the contrary, it is shown below that phage-selected zinc finger DNA-binding
domains can distinguish the two closely related bases in the context of the CpG 
dinucleotide and are hence capable of differential binding to their DNA sites depending 
on the methylation status of cytosine. 

A phage display library of the three-finger DNA-binding domain from the 
mouse transcription factor Zif268 has been described in which finger 2 was 
randomised in those positions of the DNA-recognition helix which were thought to 
function in DNA binding (Fig. 1a). The library was screened using a version of the
Zif268 DNA binding site, GCG TGG GCG, in which the middle triplet, bound by the
randomised middle finger, was replaced by a given target sequence. When
the sequences GCG GCG GCG and GCG GTG GCG were used in selections, some zinc
finger DNA-binding domains were selected which bound both sequences equally well
(Fig. 1b, c) 15, 16. However, two additional zinc finger families were isolated which
were capable of differential binding to the two closely related sites (Fig. 1b, c).
Sequence-specific recognition required discrimination of the central base in the binding 
site by amino acids in position 3 of the recognition helix of the selected zinc fingers, 
and it was noted that aspartate was selected to bind opposite cytosine in the triplet 
GCG, while alanine was selected opposite thymine in the triplet GTG. The correlation
between thymine and alanine was particularly significant, as it implied a van der Waals
interaction between the amino acid side-chain and the 5-methyl group of the base.
Indeed, when thymine was mutated to deoxyuracil in the binding sites of such fingers
there was a dramatic decrease in the strength of the intermolecular interaction (Fig. 1c).
This showed that these zinc fingers were capable of specifically recognising a 5-methyl 
group, and suggested that similar fingers might be selected which bind 5-meC by the 
same token. In order to determine whether this was indeed possible, the phage display 
library was screened with the synthetic binding site GCG GMG GCG, containing a
5-meC base analogue (M). After 5 rounds of selection, zinc finger phage were tested for
binding to 5-meC and cytosine in the context of the above site, and those capable of
specifically binding the methylated site were sequenced in the region of the zinc finger
gene. Two different clones were isolated, which were identical to the DNA-binding
domains previously selected using the binding site GCG GTG GCG.
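
The binding sites used in these selections differ only at the central base of the middle triplet. As a minimal sketch (the function names are hypothetical and chosen here for illustration; M for 5-meC follows the text, and U for deoxyuracil is used as a convenient single-letter code), the sites can be generated and the CpG context of the central base checked as follows:

```python
# Build the nine-base sites GCG GNG GCG, varying only the central base.
# M = 5-methylcytosine, U = deoxyuracil. All names are illustrative only.
FLANK_5 = "GCG"
FLANK_3 = "GCG"
ALLOWED_CENTRAL = {"T", "C", "M", "U"}


def make_site(central: str) -> str:
    """Return the site with the given central base in the middle triplet."""
    if central not in ALLOWED_CENTRAL:
        raise ValueError(f"unexpected central base: {central!r}")
    return FLANK_5 + "G" + central + "G" + FLANK_3


def central_in_cpg_context(site: str) -> bool:
    """True if the central base (index 4) is C or M directly preceding G."""
    return site[4] in {"C", "M"} and site[5] == "G"


if __name__ == "__main__":
    for base in "TCMU":
        site = make_site(base)
        print(site, central_in_cpg_context(site))
```

Only the cytosine and 5-meC variants place the central base in a CpG context, which is why this pair of sites serves to test discrimination of methylation status.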

Hence the various zinc finger phage selections described above yielded different 
fingers able to bind the generic DNA sequence GCG GNG GCG, where N was either
thymine, cytosine or 5-meC. A full complement of fingers was selected for recognition
of the cytosine/5-meC pair in the above context, some of which recognised one type of
base exclusively, while others bound both bases equally well (Figs 1c and 2).
However, any fingers which recognised 5-meC were unable to discriminate against
thymine (Fig. 1c).

The zinc finger amino acid residues which were selected by the interaction
between the randomised recognition helix and the central base of the DNA binding site 
could be rationalised in terms of previously elucidated zinc finger-DNA recognition 
rules 18. Fingers with alanine in position 3 of the recognition helix specifically bound
5-meC and thymine owing to a tight hydrophobic interaction between the side chain and 
the 5-methyl group which is present in both bases. In contrast, a finger with valine in 
position 3 was also able to accommodate cytosine in addition to the two methylated 
bases, presumably by the use of different rotamers. Fingers with aspartate in position 3 
bound cytosine specifically, perhaps by forming a ring structure which packs against 
the pyrimidine as was observed in the refined crystal structure of Zif268, although it
is noted that in this case deoxyuracil, which is bound only weakly, might have been
expected to be equally acceptable (Fig. 1c).
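
The correlations described above between the residue in position 3 and the central base can be summarised as a small lookup table. The following Python sketch is illustrative only: the names `POSITION3_PREFERENCES`, `position3` and `bases_bound` are hypothetical, the helix sequences are those listed in the legend to Fig. 1c, and the table encodes only the qualitative observations reported in this text. It is not a general predictive recognition code.

```python
# Qualitative summary of the position-3 observations reported in the text.
# M denotes 5-methylcytosine. Names here are illustrative, not from the paper.
POSITION3_PREFERENCES = {
    "A": {"M", "T"},       # alanine: packs against the 5-methyl group
    "V": {"M", "T", "C"},  # valine: also accommodates cytosine
    "D": {"C"},            # aspartate: cytosine-specific
}


def position3(helix: str) -> str:
    """Return the residue at helix position 3.

    Helix strings are written starting at position -1 (there is no
    position 0), so position 3 is the fourth character.
    """
    if len(helix) < 4:
        raise ValueError("helix sequence too short")
    return helix[3]


def bases_bound(helix: str) -> set[str]:
    """Look up the central bases bound; empty set if the residue is not listed."""
    return POSITION3_PREFERENCES.get(position3(helix), set())


if __name__ == "__main__":
    for clone in ("REDVLIRHGK", "RADALMVHKR", "RGPDLARHGR"):
        print(clone, position3(clone), sorted(bases_bound(clone)))
```

Residues not covered by the observations above return an empty set rather than a guess, since the text makes no claim about them.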

Zinc fingers were selected from a phage display library to bind the minor base 
5-meC but not cytosine, and thus to discriminate the methylation status of cytosine in 
the context of a CpG dinucleotide. Since this DNA-binding motif is capable of 
discriminating 5-meC, the zinc finger transcription factor Sp1 may have evolved to
recognise its DNA binding sites regardless of their methylation status. On the other 
hand, more specialised zinc finger transcription factors may respond to methylation of 
their binding sites in the same way demonstrated for phage-selected zinc finger DNA- 
binding domains. Moreover, because the DNA binding differences shown in this paper 
resulted from recognition of only a single methyl functional group, zinc fingers which 
specifically discriminate methylation of multiple CpG dinucleotides in their binding 
sites should be capable of a more pronounced response to methylation of DNA. 

Recently, further zinc fingers which bind methylated DNA have been selected 
from phage display libraries, and discrimination between cytosine and 5-meC has also 
been rationally engineered into tailored zinc finger domains (unpublished). Zinc finger 
DNA-binding domains designed to discriminate 5-meC in predetermined sequences are 
potentially powerful tools which may find applications in the study and manipulation of 
DNA methylation, mutagenesis, restriction/modification systems, gene dosage
compensation, genomic imprinting, and the control of gene expression.

FIGURE LEGENDS

Fig. 1a Alignment of the amino acid sequence of the three fingers from Zif268 used in
the phage display library. Randomised residue positions in the α-helix of finger 2 are
marked 'X' and are numbered above the alignment relative to the first helical residue 
(position 1). Residues which form the hydrophobic core are circled; zinc ligands are 
written as white letters on a black circle background; and positions comprising the 
secondary structure elements of a zinc finger are marked below the sequence. 

Fig. 1b Amino acid sequences of the variant α-helical regions from some zinc fingers
selected by phage display using the DNA binding site GCGGNGGCG where the
central (bold) nucleotide of the middle (underlined) triplet was either: (i)
5-methylcytosine, (ii) thymine, or (iii) cytosine. Amino acid sequences are listed below
the DNA oligo used in their selection. Amino acid positions are numbered above the 
aligned sequences relative to the first helical residue (position 1). Circled residues (in 
position 3) are predicted to contact the middle nucleotide of the binding site. 

Fig. 1c Phage ELISA binding assay showing discrimination of pyrimidines by
representative phage-selected zinc fingers. The matrix shows three different zinc finger
phage clones (x, y and z) reacted with four different DNA binding sites present at a
concentration of 3 nM. Binding is represented by vertical bars which indicate the OD
obtained by ELISA 16. The amino acid sequences of the variant α-helical regions from
the selected zinc fingers were: REDVLIRHGK (x), RADALMVHKR (y), and
RGPDLARHGR (z). The DNA sequences contained the generic binding site
GCGGNGGCG, where the central (bold) nucleotide was either: uracil (U), thymine
(T), cytosine (C), or 5-methylcytosine (M).

Fig. 2 Effect of cytosine methylation on DNA binding by phage-selected zinc fingers. 
Graphs show three different zinc finger phage binding to the DNA sequence 
GCGGCGGCG in the presence (key: circle) and absence (key:
triangle) of methylation of the central base (bold). The zinc finger clones tested
contained variant α-helical regions of the middle finger as follows: (a)
RADALMVHKR, (b) RGPDLARHGR and (c) REDVLIRHGK. These respective 
zinc finger clones preferentially bind their cognate DNA site in the presence, absence, 
or regardless of cytosine methylation. 

[Fig. 1b, surviving panel text. A fragment of a preceding panel lists REDVLIRHGK. Panel (iii), DNA site GCG GCG GCG, helix positions -1 and 1 to 9: RGPDLARHGR, REDVLIRHGK.]